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1.  Detailed  Description  of  Work 

1.1.  Introduction 

The  main  goal  of  this  effort  was  to  enable  lightweight  clients  (mobile  devices)  to 
outsource  computations  without  divulging  any  information  about  the  data  and  or  code  of 
the  application.  This  problem  is  known  to  be  very  difficult.  In  order  to  make  progress,  we 
had  to  attempt  to  solve  the  problem  under  some  restrictions  on  the  types  of  programs.  We 
considered  programs  that  have  loops  and  programs  that  do  not  have  loops.  For  both  types 
of  programs  we  restricted  our  study  on  programs  that  have  linear  assignments  and 
conditionals.  We  also  considered  the  problem  of  hiding  the  access  patterns  to  an 
encrypted  database. 

1.2.  Summary  of  Results 

The  following  are  the  main  results  that  we  have  achieved. 

o  Provable  data  hiding  and  tamper  resistance  for  a  simple  loop  program:  we 
developed  techniques  that  enable  provable  data  hiding  for  a  simple  loop  program, 
o  Data  and  Code  Hiding  for  non-iterative  piece- wise  linear  programs:  we  developed 
techniques  that  enable  data  and  code  hiding  for  non-iterative  piece-wise  linear 
programs,  but  we  were  unable  to  fully  quantify  the  achieved  degree  of  hiding, 
o  Data  hiding  for  piece- wise  linear  programs  with  linear  conditionals:  we  studied 
conditions  under  which 
o  Hiding  of  Traversal  of  tree-structured  data. 

In  the  following  sections,  we  describe  these  results  in  details. 
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1.3.  Code  Hiding  for  a  Simple  Loop  Program 

We  proposed  a  non-interactive  scheme  to  provide  data  hiding  for  a  program  of  the  form 
forlAoX  =  MX, 

where  X  is  a  complex  input  vector  over  Cn,  /  is  an  variable  integer  input,  and  M  is  a  n  x  « 
complex  matrix  (publication  [4]).  The  scheme  allows  a  client  to  encrypt  X  without 
leaking  any  information  about  it  to  the  server.  Unlike  other  proposed  schemes,  our 
scheme  allows  a  client  to  determine  efficiently  and  with  high  probability  whether  the 
results  returned  by  the  server  are  correct  or  not.  The  scheme  is  non-interactive;  the  client 
sends  only  one  message  to  the  server  and  the  server  sends  only  one  message  to  the  client. 
This  is  the  first  hiding  scheme  that  we  are  aware  of  that  apply  to  real  and  complex  inputs. 
Other  hiding  schemes  typically  assume  that  the  inputs  are  elements  of  Z/mZ  ,  (integers 
modulo  m,  where  /n  is  a  prime  number)  which  reduces  their  applicability.  While  our 
results  apply  to  a  simple  class  of  loop  programs,  the  techniques  we  develop  are 
interesting  on  their  own.  The  encryption  cost  of  the  scheme  is  of  the  order  n^  and  is 
independent  of  /.  The  execution  time  of  the  program  is  In^  or  T(«)lg  /,  where  T(«)  is  the 
execution  time  of  an  algorithm  to  square  the  matrix  M.  The  time  is  obtained  by 
applying  a  straightforward  execution  of  /  iteration  each  requiring  lu  operations.  The  time 
T(«)lg  /  is  obtained  by  calculating  M'  and  then  multiplying  the  result  with  X.  If  /  is  large, 
in  both  cases  there  are  large  saving  in  computation  for  the  client.  It  should  be  clear  that  a 
simple  way  to  hide  the  input  for  our  program  is  to  have  the  server  calculate  Mi  and  send 
the  result  to  the  client.  This  has  two  drawbacks.  First,  the  communication  complexity  is 
m  to  send  a  matrix  instead  of  n  to  send  a  vector.  Second,  it  is  not  clear  in  that  case  how 
the  checking  for  correctness  can  be  achieved. 

Another  interesting  problem  that  can  be  solved  with  our  scheme  is  that  of  joint 
computation  between  the  client  and  server.  In  that  problem,  there  is  a  fimction  f  with  two 
inputs  that  the  client  and  server  want  to  calculate.  One  input  is  provided  by  the  client  and 
one  input  is  provided  by  the  server.  The  calculation  should  be  done  is  such  a  way  that  the 
client  and  server  learn  nothing  about  the  other’s  input  except  what  they  can  deduce  from 
the  computed  value.  In  our  setting,  if  1  is  provided  by  the  server,  then  the  client  and  server 
to  compute  f(X,  1)  =  without  leaking  any  information  the  client  or  server  other  than 
what  they  can  deduce  from  the  value  of  X. 


1.4.  Hiding  Queries 

With  the  increasing  use  of  web  services,  many  new  challenges  concerning  data  security 
become  critical  in  various  application  domains.  Especially  in  mobile  computing,  where 
clients  are  generally  thin  and  have  limited  computation  power  or  memory,  a  server  (or  an 
oracle)  is  needed  to  do  the  computation  or  store  information  for  clients.  Generally,  this 
server  or  server  farm  has  very  strong  computation  capability  and  large  memory  space. 


Unfortunately,  such  an  oracle  may  not  always  be  trustworthy  and  clients  with  sensitive 
data  may  want  to  be  protected  from  malicious  oracles.  In  our  work  (publication[3]),  we 
developed  protocols  where  data  of  clients  are  stored  in  a  server's  database  in  a  way  that 
not  only  the  data,  but  also  properties  of  the  data,  and  the  data  structures  are  hidden  from 
the  server,  while  clients  can  still  efficiently  retrieve  data  based  on  available  data 
structures.  We  show  that  this  task  is  achievable  with  a  limited  communication  overhead, 
which  is  essential  in  low  bandwidth  (such  as  wireless)  distributed  systems.  This  protocol 
makes  possible  clients  to  outsource  their  sensitive  data  on  servers  without  any  prior  trust. 
In  this  protocol,  tree  nodes  are  encrypted  before  being  outsourced;  hence,  their  contents 
(if  the  nodes  are  XML  elements,  also  the  element  types)  are  hidden  from  the  untrusted 
data  store.  Each  time  a  user  wants  to  retrieve  a  node,  it  asks  for  a  set  of  nodes  called  the 
"redundancy  set"  including  the  target  node  and  additional  random  nodes.  The  redundancy 
set  hides  the  requested  node  from  the  server.  To  prevent  the  server  to  infer  the  locations 
of  the  nodes  through  repeated  trials,  after  each  node  is  retrieved,  the  node  is  swapped 
with  a  randomly  chosen  empty  node.  Thus,  the  node  moves  after  each  retrieval,  making 
any  correct  guessing  of  its  location  temporary  and  of  no  permanent  use.  The  size  of  the 
redundant  set  need  not  be  large  to  hide  long  paths.  Unlike  the  information  theoretic 
private  information  retrieval  schemes,  the  technique  requires  no  replication  of  the 
database  and  the  communication  cost  is  adaptable  and  generally  much  less  than  the  size 
of  the  database.  Compared  with  general  computationally  private  information  retrieval 
schemes,  our  technique  is  much  simpler  and  does  not  rely  on  any  cryptographic 
assumptions  except  for  the  ones  on  which  the  underlying  encryption  schemes  are  built. 
The  protocol  requires  legal  clients  to  have  access  to  keys  and  be  able  to  perform 
encryption  and  decryption  operations.  Where  encryption  and  decryption  constitute  heavy 
computation  costs  for  clients  with  very  limited  computation  power  and  memory,  we 
suggest  the  use  of  assistant  hardware,  such  as  smartcards,  to  reduce  the 
encryption/decryption  execution  costs  as  well  as  to  seciurely  disseminate  keys.  To  enable 
multiple  users  to  query  a  tree  simultaneously,  which  is  mandatory  for  an  open  and  public 
data  store,  we  also  devised  deadlock  free  concurrency  control  mechanisms.  Further  study 
of  the  security  guarantees  provided  by  this  protocol  requires  modeling  of  user  queries  and 
interaction  between  the  client  and  the  server.  At  the  high-level,  customer  retrieval  of  a 
redundancy  set  constitutes  a  call.  Each  query  then  can  be  represented  as  an  ordered  set  of 
calls.  Supposing  that  there  is  a  transport  layer  security  mechanism  that  hides  the  identity 
of  each  retrieval  request,  we  can  model  database/application  hosting  server's  view  of  data 
accesses  as  a  stream  of  calls  from  unknown  origins.  The  server  might  still  be  able  infer 
the  tree  structure  by  observing  the  call  stream,  guessing  which  calls  in  the  stream  belong 
to  a  single  query,  guessing  which  queries  are  the  same  or  similar,  and  then  looking  at  the 
intersection  of  the  redundant  sets  of  the  corresponding  calls  in  each  query.  In  order  to 
measure  the  amount  of  security  provided  by  the  system,  thus,  it  is  necessary  to 
understand  the  probability  with  which  the  server  can  use  these  intersections  to  break  the 
security: 

o  Step  1 :  Given  two  windows  of  calls  observed  by  the  server,  each  with  size  w, 
what  is  the  probability  of  finding  two  queries  from  that  are  the  same. 


o 


Step  2:  Given  two  windows  of  calls  observed  by  the  server,  each  with  size  w, 
that  are  known  to  contain  two  identical  queries,  what  is  the  probability  for  the 
server  to  identify  the  individual  calls  of  these  two  queries, 
o  Step  3:  Given  the  calls  of  two  identical  queries,  what  is  the  probability  for  the 
server  to  discover  the  query  path. 

Attacks  by  the  server  would  rely  on  the  intersection  property  described  above.  Intuitively, 
such  attacks  can  be  prevented  by  ensuring  that  intersections  do  not  reveal  much 
information.  This  is  achieved  by  modifying  the  protocols  such  that  the  redundant  sets 
intersect  at  multiple  nodes  and  by  inserting  appropriate  dummy/interfering  calls  which 
break  the  relationship  between  queries  and  intersecting  calls. 


1.5.  Book  Chapters 

We  wrote  two  book  chapters  as  part  of  this  effort  (publications  [1]  and  [5]).  Both  chapters 
survey  different  aspects  of  the  state  of  the  art  in  the  field.  Both  chapter  provide  a  good 
survey  of  existing  techniques  for  protecting  hosts  in  a  distributed  setting. 

In  the  chapter  by  Bazzi  and  Karablieh,  we  examined  the  requirements  for  developing 
reliable  E-commerce  applications  in  large  and  open  distributed  systems.  Our  emphasis 
was  on  mobile  agents  technology.  We  identified  general  requirements  and  security 
requirements  for  developing  E-commerce  application.  We  also  examined  security  threats 
that  result  from  the  use  of  agents.  We  considered  both  threats  to  the  agents  themselves 
from  hosts  on  which  they  execute  and  threats  to  the  hosts  fi’om  agents  executing  on  them. 
Threats  to  agents  fall  under  the  general  topic  of  our  grant  and  most  of  the  results  in  that 
area  are  not  developed  enough  for  practical  applications.  Threats  to  hosts  are  a  much 
better  understood  subject.  We  classified  techniques  for  dealing  with  threats  to  hosts  as 
either  compile-time  or  runtime  techniques.  The  techniques  we  surveyed  include  proof 
carrying  codes,  sand  boxing,  execution  monitoring,  code  signing,  state  appraisal,  and  path 
histories.  For  protecting  agents  from  malicious  hosts,  we  considered  clueless  agents, 
partial  result  encapsulation,  computing  with  encrypted  functions,  and  code  obfuscation. 
Finally,  we  considered  fault-tolerance  and  service  replication  as  a  technique  to  protect 
agents  against  malicious  hosts. 

In  the  chapter  by  Lin  and  Candan,  we  provided  an  overview  of  typical  security  challenges 
faced  by  distributed  application  hosting  services  and  surveyed  the  main  secruity  available 
mechanisms  and  the  open  challenges  at  the  different  tiers  in  the  information  management 
hierarchy.  This  chapter  also  discusses  hiding  access  structure  from  a  server  hosting 
encrypted  data.  All  the  topics  of  the  chapter  fall  xmder  the  general  topic  of  our  grant.  The 
chapter  also  discusses  access  control  and  a  variety  of  security  policies. 


1.6.  Significance  to  Field 


Our  results  are  the  only  results  that  we  are  aware  of  that  directly  apply  to  imperative 
programs.  Other  results  in  this  area  apply  to  Boolean  circuits  or  Boolean  predicates. 
Working  with  imperative  program  is  necessary  if  one  hopes  to  have  practical  applications 
for  the  work.  Also,  our  results  are  the  only  results  that  we  are  aware  of  that  apply  to  real 
and  complex  data.  Other  results  in  the  field  apply  to  data  inputs  that  are  integers  modulo 
some  prime  number. 

1.7.  Relevance  to  AF  Mission 

Enabling  secure  computation  outsourcing  will  enable  lightweight  clients  to  outsource 
computations  to  servers  that  can  be  potentially  compromised.  While  our  results  have 
limited  applicability,  they  are  an  important  step  towards  that  goal. 

1.7.  Potential  applications  to  AF  and  civilian  application  challenges 

The  main  application  is  computation  outsourcing.  Our  current  results  do  not  have  direct 
practical  applications. 


1.8.  Interactions/Transitions 

o  Professor  Bazzi  participated  in  a  NSF  CAREER  panel  and  reviewed  some 
security-related  proposals  (2000). 
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